Similar Thesaurus Based on Arabic Document: an Overview and Comparison

Author

  • Essam S. Hanandeh
Abstract

The rapid growth of modern information retrieval systems (IRS) has made retrieval more difficult, especially over natural-language text. Search in Arabic, as a natural language, is not yet good enough. This paper builds a similar thesaurus for Arabic using two mechanisms, a full-word mechanism and a stemmed mechanism, and then compares them. The comparison shows that the similar thesaurus using the stemmed mechanism gives better results than traditional retrieval using the same mechanism, and that the similar thesaurus improves recall and precision over a traditional information retrieval system at the measured recall and precision levels.
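
The two mechanisms can be illustrated with a minimal Python sketch. The abstract does not give the paper's weighting scheme, similarity formula, or stemmer, so everything below is an assumption made for illustration: terms are compared by cosine similarity over raw term-frequency vectors, `light_stem` is a toy affix stripper rather than a real Arabic stemmer, and the function names are hypothetical.

```python
import math
from collections import Counter

# Assumed affixes for a toy light stemmer; not the stemmer used in the paper.
PREFIXES = ("ال",)
SUFFIXES = ("ات", "ون", "ين", "ة")


def light_stem(token):
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    return token


def term_vectors(docs, normalize=lambda t: t):
    """Map each (normalized) term to {doc_id: term frequency}."""
    vectors = {}
    for doc_id, text in enumerate(docs):
        counts = Counter(normalize(tok) for tok in text.split())
        for term, tf in counts.items():
            vectors.setdefault(term, {})[doc_id] = tf
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_thesaurus(docs, normalize=lambda t: t, top_k=3):
    """For each term, list the top_k other terms by cosine similarity."""
    vectors = term_vectors(docs, normalize)
    thesaurus = {}
    for term, vec in vectors.items():
        scored = [(other, cosine(vec, other_vec))
                  for other, other_vec in vectors.items() if other != term]
        scored = [(other, score) for other, score in scored if score > 0]
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        thesaurus[term] = scored[:top_k]
    return thesaurus


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Calling `similarity_thesaurus(docs)` corresponds to the full-word mechanism, and `similarity_thesaurus(docs, normalize=light_stem)` to the stemmed one. Running the same queries through both and scoring the results with `precision_recall` gives the kind of comparison the abstract describes.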

Similar resources

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...

A Method for Keyword Extraction and Term Weighting to Improve Persian Text Classification

Due to the ever-increasing expansion of information and the huge amount of unstructured documents, keywords play a very important role in information retrieval. Because manual keyword extraction faces various challenges, automated extraction seems inevitable. This research attempts to use a thesaurus (a structured word-net) to extract keywords automatically. A...

Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Besides its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...

Evaluation of Different Query Expansion Techniques for Arabic Text Retrieval System

The word mismatch problem is fundamental to Information retrieval. Query expansion process helps to overcome this problem. Based on the Arabic corpuses, the comparisons between two query expansion techniques (global and local query) have been conducted to determine the query effectiveness. First one represents the local context analysis which represents a local method, while a global method was...

The Effect of Combining Different Semantic Relations on Arabic Text Classification

A massive amount of documents are being posted online every minute. The task of document classification requires extensive background work on the content of documents, where keyword-based matching alone may not be sufficient. Much research has been carried out in several languages that has revealed significant results. However, Arabic documents still pose a great challenge due to the nature of ...

Publication date: 2013